Abstract

We present a mathematical analysis of the speciation model for food-web structure, which had in 
previous work been shown to yield a good description of empirical data of food-web topology. The 
degree distributions of the network are derived. Properties of the speciation model are compared to 
those of other models that successfully describe empirical data. It is argued that the speciation model 
unifies the underlying ideas of previous theories. In particular, it offers a mechanistic explanation for 
the success of the niche model of Williams and Martinez and the frequent observation of intervality in 
empirical food webs. 
1 Introduction



The theoretical study of the topology of food webs, the networks formed by the trophic interactions in
ecological communities, has led to increasingly precise descriptions of the empirically observed structures.
In the early work of Cohen (1978), Briand and Cohen (1987), Sugihara (1983) and others, several simple
food-web models were investigated. The cascade model (Cohen et al., 1990) was identified as a
description that reproduced the available data particularly well. In the cascade model a food web consists
of a fixed number S of species, and each species consumes any species which precedes it in a given linear
ordering with a fixed probability $C_0$. The analysis of this model led to several predictions (Cohen et al.,
1990) which inspired a more systematic and accurate collection of food-web data by empiricists (e.g., Hall
and Raffaelli, 1991, Havens, 1992, Martinez, 1991, Polis, 1991). Based on the new data, Williams and
Martinez (2000) showed that their niche model was a significant improvement. In this model, species are
ordered according to their niche value n that is chosen randomly from the interval [0, 1]. To determine the
diet of a species, an interval of random width < n is drawn with even distribution from within^1 [0, 1],
restricted by the condition that at least half of the interval is located below the niche value n of this
species. Its diet then consists of all species with a niche value in this interval.
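The cascade rule described above can be sketched in a few lines. This is only an illustration of the verbal description; the function name `cascade_web`, the seed, and the example values are assumptions made here, not part of the original work.

```python
import random

def cascade_web(num_species, c0, rng):
    """Return a matrix m with m[i][j] = 1 when species j eats species i.

    Species are labelled by their rank in the linear ordering. Species j
    may only consume species that precede it (i < j), and does so
    independently with probability c0.
    """
    m = [[0] * num_species for _ in range(num_species)]
    for j in range(num_species):
        for i in range(j):
            if rng.random() < c0:
                m[i][j] = 1
    return m

# Illustrative values only (not fitted to any data set)
web = cascade_web(num_species=6, c0=0.3, rng=random.Random(1))
```

By construction the matrix is strictly upper triangular, so the sketch contains neither cannibalism nor loops.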

A mathematical analysis by Camacho et al. (2002a) revealed the importance of the specific rule for
determining the width of the feeding intervals: by choosing it from an approximately exponential
distribution, the resulting food webs show a distribution of generality (the number of a species' resources)
which is strongly skewed towards low values, in good accordance with observations (Camacho et al., 2002b,
Stouffer et al., 2005).



By construction, the niche model also reproduces a property called intervality (Cohen, 1978): Species
can be ordered on a line in such a way that the diet of each consumer is a contiguous set. Intervality is
surprisingly often found in small webs (Cohen, 1978). Larger webs exhibit it to some degree (Cattin et al.,
2004, Cohen et al., 1990). Cattin et al. (2004) argued that intervality can be a consequence of the fact that
similar, evolutionarily related species consume similar resources. They proposed the nested hierarchy model,
a modification of the niche model which incorporates this idea and better accounts for the observed degree
of intervality.

Apart from these mostly descriptive models of food-web topology there have also been several attempts
to explain the structure of food webs by the interaction of population dynamical and evolutionary
mechanisms (e.g., Caldarelli et al., 1998, Drossel et al., 2001, Tokita and Yasutomi, 2003, Yoshida, 2003).
Characteristic for most of these models is their high computational complexity, which makes their
quantitative statistical validation difficult. Therefore it can be advantageous to consider first explanatory
models that are explicit in terms of either population dynamics (e.g., Montoya and Sole, 2003, Pimm, 1984)
or evolutionary mechanisms (e.g., Amaral and Meyer, 1999, Camacho and Sole, 2000, Drossel, 1998) alone.

The recently proposed speciation model (Rossberg et al., 2005) is of the purely evolutionary type. It
combines mechanisms corresponding to speciations and extinctions with simple assumptions regarding the
evolutionary inheritance of trophic links. In spirit, the model is similar to the duplication-divergence model
of proteome evolution by Vazquez et al. (2003) or the related model by Pastor-Satorras et al. (2003), even
though in the speciation model directed links and the possibility of extinctions complicate the situation.

Furthermore, the speciation model takes into account the tendency of food webs to respect a "pecking
order", as it is ideally realized in the cascade model. It is currently unclear if the dominating mechanism
imposing this ordering of species is the physical advantage that larger predators have over smaller prey,
energy conservation and dissipation, or some other constraint. The idea that the pecking order is
essentially an ordering by body size has often been discussed (Cattin et al., 2004, Cohen et al., 1993,
Memmott et al., 2000, Warren and Lawton, 1987). The speciation model makes this hypothesis explicit by
postulating an allometric relationship between body sizes and evolution rates.

The speciation model has been validated by a systematic statistical analysis based on a comparison of
twelve model properties (such as the average chain length, the fraction of top predators, the degree of
intervality, or the clustering coefficient) with empirical data (Rossberg et al., 2005). These numerical
results suggest that the speciation model reproduces observed food-web properties even better than the
niche model or the nested hierarchy model. The aim of the current work is to present some analytic results
that allow insights into how important food-web properties derive from the model specifications. After
stating the model definition in Sec. 2, the steady-state distribution of the number of species S and the
expectation value of the directed connectance C (sometimes referred to as the food-web "complexity") are
derived in Sec. 3. These quantities are important because they are used as control parameters in other
models. Section 3 also contains a characterization of the species pool in terms of evolutionary "clades"
which invites a comparison with empirical data. Section 4 is devoted to a characterization of the model in
terms of the distributions of generality and vulnerability (the number of a species' consumers). Based on
these results, the speciation model is compared with the cascade model, the niche model, and the nested
hierarchy model in Sec. 5; common properties and differences are pointed out. Two variants of the
speciation model, which leave the analytic properties derived below unchanged, are introduced in Sec. 6. A
discussion and interpretation of the results is provided in Sec. 7.

1 The original description of the model (Williams and Martinez, 2000) is inaccurate at this point.

2 Definition of the speciation model



This section restates the definition of the speciation model given elsewhere (Rossberg et al., 2005), since it
will be the starting point for the subsequent analysis. For a motivation of the model and a discussion of
design decisions we refer to the original work. The speciation model describes an abstract species pool, the
set of trophic links between the species, and the evolution of both. The model is described in terms of a
stochastic process characterized by the parameters $r_1$, $r_+$, $r_-$, $R$, $D$, $\lambda$, $C_0$, and $\beta$.

2.1 The evolution of the species pool 

Each species i in the pool is associated with a speed parameter $s_i$ in the range [0, R]. The speed parameter
characterizes the evolution rate of a species and is thought to be inversely correlated with the logarithm of
the species' body mass by an allometric law (see Rossberg et al. (2005) for discussion). In any infinitesimal
time interval [t, t + dt] three kinds of events can occur: adaptations of foreign species to the habitat (i.e.,
invasions on an evolutionary time scale), extinctions, and speciations. The probability for the adaptation of
a new species k with speed parameter in the infinitesimal range $s_k \in [s, s + ds]$ is $r_1 \exp(s)\, ds\, dt$. When a
new species is adapting to the habitat, it is added to the species pool. The probability that some species i
of the species pool becomes extinct is $r_- \exp(s_i)\, dt$. When a species becomes extinct, it is removed from
the species pool. Finally, the probability that some species i from the species pool speciates is
$r_+ \exp(s_i)\, dt$. When i speciates, a new species j with speed parameter $s_j = s_i + \delta$ is added to the species
pool, where $\delta$ is a zero-mean Gaussian random variable with ${\rm var}\,\delta = D$. If $s_i + \delta$ exceeds the range [0, R],
$s_j = -(s_i + \delta)$ or $s_j = 2R - (s_i + \delta)$ are used instead (reflecting boundaries). The probabilities for any of
these events to occur are independent.
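The three event types of Section 2.1 can be simulated with a direct (Gillespie-type) method. The following is a minimal sketch under that assumption; the function names, the choice of algorithm, and the numerical values in the usage lines are illustrative and are not taken from the original work. The total adaptation rate is the integral of $r_1 \exp(s)$ over [0, R], and a new speed parameter is drawn from the density proportional to $\exp(s)$ by inversion.

```python
import math
import random

def reflect(s, big_r):
    """Fold s back into [0, big_r] (reflecting boundaries)."""
    while s < 0.0 or s > big_r:
        s = -s if s < 0.0 else 2.0 * big_r - s
    return s

def simulate_pool(r1, r_plus, r_minus, big_r, d, t_end, rng):
    """Simulate the species pool up to time t_end; return the speed parameters."""
    pool = []                                    # s_i of all species present
    t = 0.0
    adapt_rate = r1 * (math.exp(big_r) - 1.0)    # integral of r1*exp(s) over [0, R]
    while True:
        weights = [math.exp(s) for s in pool]
        total = adapt_rate + (r_plus + r_minus) * sum(weights)
        t += rng.expovariate(total)              # waiting time to the next event
        if t > t_end:
            return pool
        if rng.random() * total < adapt_rate:
            # adaptation: s has density proportional to exp(s) on [0, R]
            u = rng.random()
            pool.append(math.log(1.0 + u * (math.exp(big_r) - 1.0)))
            continue
        # extinction or speciation of species i, chosen with weight exp(s_i)
        i = rng.choices(range(len(pool)), weights=weights)[0]
        if rng.random() < r_minus / (r_plus + r_minus):
            pool.pop(i)                          # extinction
        else:
            delta = rng.gauss(0.0, math.sqrt(d)) # var(delta) = D
            pool.append(reflect(pool[i] + delta, big_r))

# Illustrative parameters (small R keeps the number of events low)
R_DEMO = math.log(10.0)
pool = simulate_pool(r1=0.5, r_plus=0.8, r_minus=1.0, big_r=R_DEMO,
                     d=0.0025, t_end=5.0, rng=random.Random(0))
```

Recomputing all weights at every step is wasteful for large pools; it is kept here only because it makes the correspondence to the three rates easy to see.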

2.2 The evolution of the food web 

The food web is described by a connectivity (or adjacency) matrix $(m_{ij})$, with connectivity values $m_{ij} = 1$
when j eats i and $m_{ij} = 0$ otherwise. Possible consumers l of species i are defined as species with
$s_l < s_i + \lambda R$, possible resources h as those with $s_h > s_i - \lambda R$. The connectivity $m_{ij}$ can be 1 only when i
is a possible resource of j. The connectivity of a new species adapting to the habitat to all possible
consumers and resources is set to 1 with probability $C_0$ and to 0 otherwise. Upon speciation, the
connectivity values of the descendant species j to possible consumers and resources are copied from the
corresponding connectivity values of the parent species i with probability $1 - \beta$ (i.e., links break with
probability $\beta$). The connectivity values to all possible resources and consumers of j which have not been
copied are set to 1 with probability $C_0$ and to 0 otherwise.
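The inheritance rule for trophic links can be sketched as follows. The data layout (a dictionary from partner identifiers to connectivity values) and the function names are assumptions made for this illustration. The sketch covers only partners that are allometrically possible for the descendant and for which the parent has a connectivity value; how to enumerate those partners is left aside.

```python
import random

def draw_link(c0, rng):
    """Random connectivity value: 1 with probability c0, otherwise 0."""
    return 1 if rng.random() < c0 else 0

def inherit_links(parent_links, beta, c0, rng):
    """Connectivity values of a descendant species.

    parent_links maps each possible partner of the descendant to the
    parent's connectivity value (0 or 1). Each value is copied with
    probability 1 - beta; otherwise the link breaks and is redrawn,
    becoming 1 with probability c0.
    """
    child = {}
    for partner, value in parent_links.items():
        if rng.random() >= beta:
            child[partner] = value               # copied from the parent
        else:
            child[partner] = draw_link(c0, rng)  # broken and redrawn
    return child

parent = {"a": 1, "b": 0, "c": 1}
child = inherit_links(parent, beta=0.05, c0=0.2, rng=random.Random(3))
```

Note that a broken link can be re-established immediately, which happens with probability $\beta C_0$ per link.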

2.3 Typical parameters 



In our previous study (Rossberg et al., 2005) the predictions of the speciation model were compared to
empirical data, and maximum likelihood fits of the model to empirical data sets for fixed $R = \ln 10^4$,
$D = 0.0025$, $r_- = 1$ were computed. For brevity we refer to these parameter sets as "typical values"
hereafter. For the convenience of the reader the fitted values are listed in Table 1 together with some
derived expressions relevant for the calculations below.



3 Basic statistical properties of the model steady state 



The number S of species in a food web and the number L of trophic links connecting them belong to the
simplest quantities used to characterize food webs. Often L is expressed in terms of the directed
connectance $C = L/S^2$ or related quantities. In what follows, the steady-state distribution of S and the
expectation value of C for the speciation model are derived. For these calculations, it is helpful to imagine
the species pool as being divided into clades. Following Yoshida (2002, 2003), a clade is here defined as the
group of all currently existing descendant species of a founder species that entered the species pool through

Food web:                                        BB      Sk      Co      Ch      SM      Yth     LR

model parameters:
  $r_+$ (= $\rho$)                               0.914   0.934   0.961   0.959   0.801   0.949   0.991
  $r_1$                                          0.17    0.21    0.13    0.21    0.92    0.67    0.13
  $\lambda$                                      0.12    0.082   0.006   0.25    0       0.001   0.025
  $C_0$                                          0.37    0.53    0.58    0.064   0.23    0.081   0.16
  $\beta$                                        0.059   0.012   0.014   0.029   0.034   0.040   0.0063

derived quantities:
  web size (before lumping) $\langle S\rangle$   18.2    29.0    31.4    47.9    42.7    122.0   137.4
  ${\rm var}\,S/\langle S\rangle^2$              0.64    0.53    0.81    0.51    0.12    0.16    0.81
  clade size $\langle n\rangle$: Eq. (14)        4.3     5.2     7.6     7.3     2.5     6.3     23.5
  number of clades $\langle c\rangle$: Eq. (15)  4.2     5.5     4.2     6.6     17.1    19.5    5.8
  clade lifetime in gen.: $-\ln(1-\rho)$         2.5     2.7     3.2     3.2     1.6     3.0     4.7
  clades in diet: Eq. (33), $\Lambda = R$        2.3     3.2     2.8     0.7     4.5     2.8     1.5
  diet breakout: Eq. (p5^                        0.44    0.16    0.31    0.41    0.12    0.43    0.44

Table 1: Maximum-likelihood model parameters for the speciation model obtained for seven empirical food
webs and quantities derived thereof. The abbreviations stand for BB: Bridge Brook Lake (Havens, 1992),
Sk: Skipwith Pond (Warren, 1989), Co: Coachella Desert (Polis, 1991), Ch: Chesapeake Bay (Baird and
Ulanowicz, 1989), SM: St. Martin Island (Goldwasser and Roughgarden, 1993), Yth: Ythan Estuary (Hall
and Raffaelli, 1991), LR: Little Rock Lake (Martinez, 1991).



an adaptation process, in close correspondence with the standard phylogenetic notion. When D is
sufficiently small, the speed parameter s is approximately the same for all species in a clade, and the
ranges of s covered by different clades do not overlap. We can then divide the s axis into small intervals
$[s, s + \Delta s]$, and account for the number of species in each interval separately. The absence of overlap
between clades is used only as a trick to simplify accounting. The final results do not depend on this
assumption. The condition that the spread of s within clades is small will be made more precise in the
detailed discussion of the clades in Section 3.2 below.

3.1 The steady-state distribution of the species number S 

In order to obtain the steady-state distribution of the total number of species, consider first only a small
interval $[s, s + \Delta s]$ on the speed-parameter axis. The master equation for the probability distribution $p_n$ of
the number n of species in the interval is given by

$$\frac{dp_n}{dt} = j_{n-1,n} - j_{n,n+1} \tag{1}$$

for $n \ge 1$ and

$$\frac{dp_0}{dt} = -j_{0,1}, \tag{2}$$

with the probability current $j_{n,n+1}$, resulting from the balance of processes incrementing and
decrementing n, given by

$$j_{n,n+1} = e^{s}\left[(n r_+ + r_1 \Delta s)\, p_n - (n+1)\, r_-\, p_{n+1}\right]. \tag{3}$$

The possibility of speciations that cross the boundaries of the range $[s, s + \Delta s]$ is ignored here, because the
corresponding corrections would cancel out when summing up the n values from different intervals below.
The reflecting boundary conditions at the endpoints of the full s-range [0, R] ensure that (3) holds also for
the intervals adjacent to the endpoints.

Figure 1: Typical steady-state distribution of the number of species S. The solid line is $P(\kappa R, \rho; S)$ as
defined by Eq. (A.4); the histogram was obtained by direct simulations. Parameters correspond to Bridge
Brook Lake (Tab. 1).

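For the steady state $j_{n,n+1} = 0$ one gets

$$p_1 = \frac{r_1}{r_-}\, p_0\, \Delta s \tag{4}$$

and for $n \ge 1$ the recursive relation

$$p_{n+1} = \frac{n\, r_+}{(n+1)\, r_-}\, p_n + O(\Delta s), \tag{5}$$

which is solved by

$$p_n = \frac{1}{n}\left(\frac{r_+}{r_-}\right)^{n} \frac{r_1}{r_+}\, \Delta s + O(\Delta s^2). \tag{6}$$

With the abbreviations $\rho = r_+/r_-$ and $\kappa = r_1/r_+$, the corresponding moment generating function is

$$m(z) = \langle z^n \rangle = p_0\left[1 - \kappa\, \Delta s \ln(1 - \rho z)\right] + O(\Delta s^2), \tag{7}$$

with

$$p_0 = 1 + \kappa\, \Delta s \ln(1 - \rho) + O(\Delta s^2) \tag{8}$$

given by the normalization condition $m(1) = 1$. From $m(z)$ one obtains the cumulant generating function

$$K(z) = \ln m(z) = \ln p_0 - \kappa\, \Delta s \ln(1 - \rho z) + O(\Delta s^2) = \kappa\, \Delta s \ln\left(\frac{1-\rho}{1-\rho z}\right) + O(\Delta s^2). \tag{9}$$

Cumulant generating functions of this form and the corresponding distributions are discussed in
Appendix A. For example, by Eq. (A.2), the density of species along the speed-parameter line is

$$\lim_{\Delta s \to 0} \frac{\langle n \rangle}{\Delta s} = \frac{\kappa \rho}{1 - \rho} = \frac{r_1}{r_- - r_+}. \tag{10}$$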

The cumulant generating function of the sum of independent random variables is the sum of their
cumulant generating functions. Thus, the cumulant generating function for the total number of species S
can be obtained by dividing the range [0, R] into small intervals of width $\Delta s$ and summing the
contributions. With $\Delta s \to 0$ corrections $O(\Delta s^2)$ become negligible and the summation goes over into an
integration:

$$\sum_{\Delta s\text{-intervals}} \left[\kappa \ln\left(\frac{1-\rho}{1-\rho z}\right) \Delta s + O(\Delta s^2)\right] \to \int_0^R \kappa \ln\left(\frac{1-\rho}{1-\rho z}\right) ds = \kappa R \ln\left(\frac{1-\rho}{1-\rho z}\right). \tag{11}$$



This is again of the general form Eq. (A.1) discussed in Appendix A. Hence, the steady-state distribution
of the species number S is $P(\kappa R, \rho; S)$ as defined by Eq. (A.4). Figure 1 shows a typical distribution and
corresponding simulation results. The curves agree well. Only the probability for S near zero seems to be
overestimated by the theory. By Eq. (A.2), the mean number of species is

$$\langle S \rangle = \frac{\kappa R \rho}{1 - \rho} \tag{12}$$

and by Eq. (A.3) the relative variance is ${\rm var}\,S / \langle S \rangle^2 = 1/(\kappa R \rho)$. Typical relative variances (Tab. 1) can
become of the order unity. Thus, in the model, S fluctuates strongly on evolutionary time scales.
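
The mean and relative variance of S can be checked numerically. The sketch below assumes that $P(\nu, \rho; n)$ of Appendix A (not reproduced in this section) is the negative binomial distribution $\Gamma(\nu+n)/[\Gamma(\nu)\, n!]\,(1-\rho)^{\nu} \rho^{n}$, which is the distribution with generating function $[(1-\rho)/(1-\rho z)]^{\nu}$. The helper name `log_p` is hypothetical; the parameter values are those listed for Bridge Brook Lake in Table 1.

```python
import math

def log_p(nu, rho, n):
    """log P(nu, rho; n), assuming the negative binomial form stated above."""
    return (math.lgamma(nu + n) - math.lgamma(nu) - math.lgamma(n + 1)
            + nu * math.log(1.0 - rho) + n * math.log(rho))

# Bridge Brook Lake values from Table 1 (r_minus = 1, R = ln 10^4)
r1, r_plus, r_minus = 0.17, 0.914, 1.0
big_r = math.log(1.0e4)
rho = r_plus / r_minus
kappa = r1 / r_plus

mean_s = kappa * big_r * rho / (1.0 - rho)   # Eq. (12)
rel_var = 1.0 / (kappa * big_r * rho)        # var S / <S>^2

# Cross-check both moments by direct summation over the distribution
probs = [math.exp(log_p(kappa * big_r, rho, n)) for n in range(3000)]
mean_num = sum(n * p for n, p in enumerate(probs))
var_num = sum((n - mean_num) ** 2 * p for n, p in enumerate(probs))

print(round(mean_s, 1), round(rel_var, 2))   # prints: 18.2 0.64
```

The printed values agree with the first two rows of derived quantities in Table 1.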
3.2 Basic properties of clades 

The division of S into clades can be made more explicit. For example, the distribution of the number n of
species in a single clade is given by Eq. (6) conditional to $n \ge 1$:

$$\frac{p_n}{1 - p_0} = -\frac{\rho^{n}}{n \ln(1 - \rho)}. \tag{13}$$

Thus, the mean number of species per clade is

$$\langle n \rangle = \sum_{n=1}^{\infty} n\, \frac{p_n}{1 - p_0} = -\frac{\rho}{(1 - \rho) \ln(1 - \rho)}. \tag{14}$$

Further, the expectation value of the number of clades c in the food web can be estimated as

$$\langle c \rangle = \frac{\langle S \rangle}{\langle n \rangle} = -\kappa R \ln(1 - \rho). \tag{15}$$

(An exact calculation yields the same result.) Since appearances and extinctions of clades are statistically
independent, the number of clades is Poisson distributed. For typical values of $\langle n \rangle$ and $\langle c \rangle$ see Tab. 1.

To obtain the average lifetime $\tau_c$ of a clade founded by a species with speed parameter s, notice that
the probability that a clade exists in the interval $[s, s + \Delta s]$ is $1 - p_0$ with $p_0$ given by Eq. (8). On the
other hand, new clades are founded at a rate $r_1 \sigma \Delta s$ with $\sigma = \exp(s)$. The fraction of time when there is a
clade in the interval is thus $\tau_c r_1 \sigma \Delta s$. (Note that in the limit $\Delta s \to 0$ there is no overlap in the clade
lifetimes.) Thus

$$\tau_c = \lim_{\Delta s \to 0} \frac{1 - p_0}{r_1 \sigma \Delta s} = -\frac{\ln(1 - \rho)}{r_+ \sigma}. \tag{16}$$

The time that it takes for the system to reach the steady state can be estimated by the lifetime of the
slowest clade, i.e., by Eq. (16) with $\sigma = \exp(0) = 1$. This quantity is important for model simulations. For
a detailed discussion of the dynamics of the birth/death process relevant here, including the clade lifetime
distribution, see the book of Bailey (1964).

The typical number of evolutionary "generations" that a clade exists is
$\tau_c/(\text{generation time}) = \tau_c r_+ \sigma = -\ln(1 - \rho)$ (see Tab. 1 for typical values). Since in each generation the
variance of the distribution of s over a clade increases by D, the width of a clade on the speed-parameter
line is of the order

$${\rm std}\, s \approx \sqrt{-D \ln(1 - \rho)}. \tag{17}$$

The assumption made above that all members of a clade have approximately the same s is justified when
${\rm std}\, s \ll 1$.
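
As a worked example, the clade statistics of Eqs. (14) to (17) can be evaluated directly from the model parameters. The function and key names below are hypothetical; the input values are those listed for Bridge Brook Lake in Table 1 together with the fixed $D = 0.0025$ and $R = \ln 10^4$.

```python
import math

def clade_statistics(r1, r_plus, r_minus, big_r, d):
    """Evaluate Eqs. (14)-(17) for given model parameters."""
    rho = r_plus / r_minus
    kappa = r1 / r_plus
    log_term = -math.log(1.0 - rho)    # clade lifetime in generations
    return {
        "mean_clade_size": rho / ((1.0 - rho) * log_term),  # Eq. (14)
        "mean_num_clades": kappa * big_r * log_term,         # Eq. (15)
        "lifetime_generations": log_term,
        "slowest_clade_lifetime": log_term / r_plus,         # Eq. (16), sigma = 1
        "clade_width": math.sqrt(d * log_term),              # Eq. (17)
    }

bb = clade_statistics(r1=0.17, r_plus=0.914, r_minus=1.0,
                      big_r=math.log(1.0e4), d=0.0025)
for name, value in bb.items():
    print(f"{name}: {value:.3g}")
```

The product of the mean clade size and the mean number of clades reproduces $\langle S \rangle$ from Eq. (12), which is a quick consistency check on the three formulas.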

3.3 The expected directed connectance

A food-web property that has found much attention in both empirical and theoretical research is the
connectance, for example measured in terms of the directed connectance $C = L/S^2$ (Martinez, 1991) with
L denoting the total number of trophic links. To compute the expectation value of this quantity, note that
from all $S^2$ topologically possible links only some are allometrically possible in the model, namely those
from consumers i to their possible resources h with $s_h > s_i - \lambda R$ (see Sec. 2.2). A fraction $(1 - \lambda)^2/2$ of the
$s_h$-$s_i$ plane is forbidden. By construction, exactly a fraction $C_0$ of all allometrically possible links is
realized on the average in the model. Thus, as a simple estimate one gets
$S^2 [1 - (1 - \lambda)^2/2] = S^2 (1 + 2\lambda - \lambda^2)/2$ allometrically possible links and

$$C \approx C_0\, (1 + 2\lambda - \lambda^2)/2. \tag{18}$$

The exact value differs due to subtle correlations stemming from intra-clade links. As an example, we
derive C for the case that the typical intra-clade spread of s given by Eq. (17) is much smaller than $\lambda R$, so
that all intra-clade links are allometrically possible. As in Sec. 3.1, we divide the s axis into small intervals
of width $\Delta s$, and again proceed as if each clade were located in its own interval. Let the p-th interval range from
$s_p$ to $s_p + \Delta s = s_{p+1}$ and denote the number of species it contains by $n_p$. We first compute the expected
number of allometrically possible links conditional to fixed S,

$$\langle L_{\rm al} | S \rangle = \sum_{s_p > s_q - \lambda R} \langle n_p n_q | S \rangle = \sum_{p \ne q,\; s_p > s_q - \lambda R} \langle n_p n_q | S \rangle + \sum_{p} \langle n_p^2 | S \rangle. \tag{19}$$



Consider the last term first. The distribution $p_n$ of $n_p$ is given by Eqs. (6, 8). Since clades appear and
disappear independently, the probability that there are $S - n_p$ species outside the p-th interval is, just as
for the total number of species, $P(\kappa R, \rho; S - n_p)$, defined by Eq. (36) to lowest order in $\Delta s$. The probability
for a particular pair $(n_p, S)$ is therefore $p_{n_p} P(\kappa R, \rho; S - n_p)$. This can be used to calculate the probability
$p(n_p | S)$ of $n_p$ conditional to S in the usual way, giving

$$\langle n_p^2 | S \rangle = \sum_{n=0}^{S} n^2\, p(n | S) = \sum_{n=0}^{S} n^2\, \frac{p_n\, P(\kappa R, \rho; S - n)}{P(\kappa R, \rho; S)} = \frac{S\, (\kappa R + S)}{R\, (1 + \kappa R)}\, \Delta s + O(\Delta s^2). \tag{20}$$

The dependence on $\rho$ drops out. By a similar argument one obtains to lowest order in $\Delta s$

$$\langle n_p n_q | S \rangle = \sum_{m + n \le S} n\, m\, p(m, n | S) = \frac{\kappa\, (S - 1)\, S}{R\, (1 + \kappa R)}\, \Delta s^2. \tag{21}$$

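Inserting both results into (19) and taking the limit $\Delta s \to 0$ yields

$$\langle L_{\rm al} | S \rangle = S\, \frac{S + \kappa R \left[1 + \frac{1}{2}(1 + 2\lambda - \lambda^2)(S - 1)\right]}{1 + \kappa R} \tag{22}$$

$$= S^2 \left[\frac{1 + \kappa R\, \frac{1}{2}(1 + 2\lambda - \lambda^2)}{1 + \kappa R} + O\!\left(\frac{\kappa R}{S}\right)\right]. \tag{23}$$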

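Expression (23) is often a good approximation of (22). The expected directed connectance conditional to S
is $\langle C | S \rangle = C_0 \langle L_{\rm al} | S \rangle / S^2$. Dropping the undefined case S = 0, the expected connectance for freely
fluctuating S can be evaluated as

$$\langle C \rangle = C_0 \left[1 - P(\kappa R, \rho; 0)\right]^{-1} \sum_{S=1}^{\infty} \frac{\langle L_{\rm al} | S \rangle}{S^2}\, P(\kappa R, \rho; S), \tag{24}$$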


either directly numerically or, for a (complicated) closed-form expression, with the help of symbolic algebra
software. For the parameters of Bridge Brook Lake (Tab. 1), for which $\lambda R / {\rm std}\, s = 14.7$, Eq. (24) yields
$\langle C \rangle = 0.294$ while the simple estimate (18) gives $\langle C \rangle = 0.230$. Simulations yield $\langle C \rangle = 0.286$. The cases
that $\lambda R > 0$ is much smaller than the typical intra-clade spread of s and that $\lambda = 0$ (no cannibalism) can
be handled by replacing $n^2$ in Eq. (20) by $n(n + 1)/2$ or $n(n - 1)/2$ respectively. For both cases the
approximation $\langle C | S \rangle = \frac{1}{2} C_0 [1 + \kappa R (1 + 2\lambda - \lambda^2)]/(1 + \kappa R) + O(S^{-1})$ holds. For the parameters of St.
Martin Island ($\lambda = 0$) this yields $\langle C \rangle = 0.115$, while numerically $\langle C \rangle = 0.112$ is obtained.
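
The sum in Eq. (24) is straightforward to evaluate numerically. The sketch below again assumes that $P(\nu, \rho; n)$ of Appendix A is the negative binomial distribution $\Gamma(\nu+n)/[\Gamma(\nu)\, n!]\,(1-\rho)^{\nu} \rho^{n}$; the function names and the truncation `s_max` are choices made for this illustration. With the rounded Bridge Brook Lake values of Table 1 the results can differ from the figures quoted in the text in the last digits.

```python
import math

def log_p(nu, rho, n):
    """log P(nu, rho; n), assuming the negative binomial form stated above."""
    return (math.lgamma(nu + n) - math.lgamma(nu) - math.lgamma(n + 1)
            + nu * math.log(1.0 - rho) + n * math.log(rho))

def mean_links_given_s(s, kappa_r, lam):
    """Eq. (22): expected number of allometrically possible links at fixed S."""
    a = 0.5 * (1.0 + 2.0 * lam - lam ** 2)
    return s * (s + kappa_r * (1.0 + a * (s - 1))) / (1.0 + kappa_r)

def expected_connectance(c0, kappa_r, rho, lam, s_max=5000):
    """Eq. (24): average of C0 <L_al|S> / S^2 over all S >= 1."""
    p0 = math.exp(log_p(kappa_r, rho, 0))
    total = sum(mean_links_given_s(s, kappa_r, lam) / s ** 2
                * math.exp(log_p(kappa_r, rho, s))
                for s in range(1, s_max + 1))
    return c0 * total / (1.0 - p0)

def simple_estimate(c0, lam):
    """Eq. (18)."""
    return c0 * 0.5 * (1.0 + 2.0 * lam - lam ** 2)

# Bridge Brook Lake (Table 1), r_minus = 1, R = ln 10^4
r1, rho, lam, c0 = 0.17, 0.914, 0.12, 0.37
kappa_r = (r1 / rho) * math.log(1.0e4)
c_full = expected_connectance(c0, kappa_r, rho, lam)
c_simple = simple_estimate(c0, lam)
print(f"Eq. (24): {c_full:.3f}   Eq. (18): {c_simple:.3f}")
```

The difference between the two numbers comes from the $1/S$ correction in Eq. (22), which matters because small webs occur with appreciable probability.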



4 The distributions of generality and vulnerability 

In this Section, analytic approximations for the distributions of generality k (the number of resources of a
consumer) and vulnerability m (the number of consumers of a resource) are derived. When defining the
direction of trophic links in the standard way from the resource to the consumer, these are the
distributions of the in-degree and the out-degree of the food web, respectively. Degree distributions are
often thought to belong to the major determinants of the overall network topology. Due to the inherent
randomness of food webs and their finite size, instances of degree distributions of empirical or model webs
are also random quantities. Nevertheless, they contain information regarding the probability distributions
of generality $P_{\rm gen}(k)$ and vulnerability $P_{\rm vul}(m)$ in the steady state. Specifically, if $N(k)$ denotes the
number of species with generality k in a web and the total number of species is S, then
$\langle N(k)/S \rangle = P_{\rm gen}(k)$ in the steady state. While this is trivial for fixed S, it is worth noting that this
relation is valid also when the value of S fluctuates randomly and when the generalities of individual
species are strongly correlated with each other and with S, as can be seen by a straightforward calculation.
Below it is shown that the conditional probability $P_{\rm gen}(k|S)$, i.e., the conditional expectation value
$\langle N(k)/S \,|\, S \rangle$, does in fact strongly depend on S. For a comparison with single instances of empirical
distributions $N(k)/S$ the conditional distribution $P_{\rm gen}(k|S)$ is therefore better suited than $P_{\rm gen}(k)$. Similar
considerations hold for the vulnerabilities. Thus, the conditional distributions are computed below.

Following Camacho et al. (2002a), we consider the distinguished limit of large food-web sizes S and
small connectances C while keeping the link density $Z := L/S = CS$ fixed. (Fixing Z for asymptotic
expansions is not meant to suggest that Z is actually fixed for large food webs.) For simplicity, we make
use of the hypothesis that resources typically evolve faster than their consumers in the extreme form that
resources evolve much faster than their consumers. This corresponds to assuming a large spread of time
scales R and a small loopiness $\lambda$. Errors due to intra-clade trophic links, which violate this hierarchy of
timescales, are small when the total number of clades (15) is large, due either to large $\kappa R$ or to small $1 - \rho$.
We note that in the case $\kappa R \gg 1$ the combined effect of these assumptions would reduce the formula for
the directed connectance derived above to $\langle C \rangle = C_0/2$, which shows that the approximations employed
here are much coarser than those used in the foregoing Sections. Nevertheless they retain the main effects
that determine the general forms of the degree distributions.



4.1 Reduction to the dynamics of the actual resources 

When most resource species evolve much faster than their consumers, the distribution of generality for a
given consumer can be approximated by the steady-state generality distribution with the consumer
assumed fixed while its resources evolve. We first show that, using a simple mean-field-type approximation,
the stochastic dynamics of the actual resources of the fixed consumer can be separated from the dynamics
of the possible resources which are not actual resources (called spurned resources below) in a self-consistent
way.

To derive the dynamics of the actual resources, consider a small interval $[s, s + \Delta s]$ in the range of
possible resources. Let $\sigma = \exp(s)$. The rate at which actual resource species in the interval speciate in
such a way that the descendant species remain actual resources is $r^*_+ \sigma$ with

$$r^*_+ = (1 - \beta)\, r_+ + \beta\, C_0\, r_+. \tag{25}$$

The first term accounts for trophic links that do not break in the speciation, and the second term for
trophic links that break but are immediately reconnected. The probability that a resource species becomes
extinct in a time interval of length dt is simply $r^*_- \sigma\, dt$ with

$$r^*_- = r_-. \tag{26}$$

Finally, the consumer can acquire a novel resource species either by an adaptation of a new species to the
habitat or by a speciation of a spurned resource in such a way that the descendant species becomes an actual
resource. For the rate at which the latter event occurs, a mean-field type approximation is employed: The
number of spurned resources $n^\circ$ in the speed-parameter range $[s, s + \Delta s]$ is approximated by its
expectation value $\langle n^\circ \rangle$. The rate at which a predator acquires novel resources (that did not speciate from
an existing resource species) in this range is then given by $C_0\, r^*_1\, \sigma\, \Delta s$ with

$$r^*_1 = r_1 + \frac{\beta\, r_+ \langle n^\circ \rangle}{\Delta s}. \tag{27}$$

The first term represents new adaptations, the second term mutations of spurned species. With this
approximation, the expectation value for the number $n^*$ of actual resources in the range $[s, s + \Delta s]$ can be
calculated as

$$\langle n^* \rangle = \frac{C_0\, r^*_1}{r^*_- - r^*_+}\, \Delta s \tag{28}$$

by methods analogous to those used in Section 3.1. Deviations from this mean-field approximation occur
because the expectation value $\langle n^\circ \rangle$ is correlated to $n^*$ by the breaking of actual links, which occurs at a
rate $O(\beta)$. Since the contribution of $\langle n^\circ \rangle$ to the dynamics of $n^*$ is also of order $O(\beta)$, the resulting error in
the distribution of $n^*$ is $O(\beta^2)$.

For the dynamics of the number of spurned resources, a set of equations corresponding to Eqs. (25)-(28)
can be set up by replacing $C_0 \to 1 - C_0$ and interchanging the indices $*$ and $\circ$. These equations can
be used to eliminate $\langle n^\circ \rangle$ from Eq. (27), yielding

$$r^*_1 = r_1 + \frac{\beta\, (1 - C_0)\, r_+}{r_- - r_+}\, r_1. \tag{29}$$
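
The effective rates can be checked for self-consistency with a few lines of code. The sketch below uses $r^*_+ = (1-\beta) r_+ + \beta C_0 r_+$, $r^*_- = r_-$ and $r^*_1 = r_1 [1 + \beta (1 - C_0) r_+ / (r_- - r_+)]$, and verifies that the resulting density of actual resources, $C_0 r^*_1 / (r^*_- - r^*_+)$, equals $C_0$ times the overall species density $r_1/(r_- - r_+)$. The function name is hypothetical; the numerical values are those listed for Bridge Brook Lake in Table 1.

```python
def effective_rates(r1, r_plus, r_minus, beta, c0):
    """Effective rates for the actual resources of a fixed consumer."""
    r_plus_eff = (1.0 - beta) * r_plus + beta * c0 * r_plus               # Eq. (25)
    r_minus_eff = r_minus                                                 # Eq. (26)
    r1_eff = r1 * (1.0 + beta * (1.0 - c0) * r_plus / (r_minus - r_plus)) # Eq. (29)
    return r1_eff, r_plus_eff, r_minus_eff

r1, r_plus, r_minus, beta, c0 = 0.17, 0.914, 1.0, 0.059, 0.37
r1_eff, r_plus_eff, r_minus_eff = effective_rates(r1, r_plus, r_minus, beta, c0)

rho_eff = r_plus_eff / r_minus_eff          # rho* used in Section 4.2
kappa_eff = r1_eff / r_plus_eff             # kappa* used in Section 4.2

density_actual = c0 * r1_eff / (r_minus_eff - r_plus_eff)  # Eq. (28) per unit s
density_all = r1 / (r_minus - r_plus)                      # Eq. (10)
print(density_actual, c0 * density_all)
```

The two printed numbers coincide up to rounding, so the mean-field closure leaves the average fraction of realized links at $C_0$.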

4.2 The generality distribution for fluctuating S 



Analogous to the calculations of Section 3.1, the cumulant generating function for the number of actual
resources for a species with speed parameter s can now be obtained as

$$K_{\rm gen}(s, z) = C_0\, \kappa^*\, \Lambda(s) \ln\left(\frac{1 - \rho^*}{1 - \rho^* z}\right), \tag{30}$$

where $\rho^* = r^*_+/r^*_-$ and $\kappa^* = r^*_1/r^*_+$ are given by Eqs. (25, 26, 29), and $\Lambda(s) = \min[(1 + \lambda) R - s, R] \approx R - s$
is the size of the speed-parameter range of possible resources. The corresponding distribution function is

$$P_{\rm gen}(s, k) = P(C_0\, \kappa^* \Lambda(s), \rho^*; k) \tag{31}$$

as defined by Eq. (A.4). In particular, the expected number of a consumer's resources is

$$\langle k \rangle = \frac{C_0\, \kappa^* \Lambda(s)\, \rho^*}{1 - \rho^*} = C_0\, \Lambda(s)\, \frac{r_1}{r_- - r_+}. \tag{32}$$

Comparison with Eq. (10) shows that the mean-field approximation preserves the model property that, on
the average, a fraction $C_0$ of all allometrically possible links is realized. Just as for the overall species pool,
the diet of a consumer can be divided into several clades, each descending from a single newly acquired
resource. For example, the expected number of resource clades for a consumer is

$$-C_0\, \kappa^* \Lambda(s) \ln(1 - \rho^*), \tag{33}$$

in analogy to Eq. (15).

Figure 2: Steady-state generality distributions for fluctuating species number, calculated from Eq. (34) with
$\beta = 0$ (solid) and $\beta = 0.05$ (dashed) in comparison with direct numerical simulations (circles, squares). The
other parameters were $R = \ln 10^{20}$, $D = 0.005$, $\rho = 0.95$, $\kappa = 10/R$, $C_0 = 0.1$, $\lambda = 10^{-3}$ (no fitting). The
inset shows the same data on double-logarithmic scales.

Since, on the average, species are homogeneously distributed along s, the probability distribution
$P_{\rm gen}(k)$ of the generality of a species chosen arbitrarily from a food web is, to a good approximation, the
average of $P_{\rm gen}(s, k)$ over s. Analytically, this average is more easily calculated in terms of the moment
generating function $M_{\rm gen}(s, z) = \sum_k P_{\rm gen}(s, k)\, z^k = \exp K_{\rm gen}(s, z)$. For the simple case $\lambda = 0$ one obtains

$$M_{\rm gen}(z) = \frac{1}{R} \int_0^R M_{\rm gen}(s, z)\, ds = \frac{u - 1}{\log u} \quad \text{with} \quad u = \left(\frac{1 - \rho^*}{1 - \rho^* z}\right)^{C_0 \kappa^* R}. \tag{34}$$

The generality distribution $P_{\rm gen}(k)$ itself can be calculated by a Taylor expansion of $M_{\rm gen}(z)$ in z or
numerically from the Fourier transformation of ${\rm Re}\{M_{\rm gen}(e^{i\phi})\}$. A comparison with direct numerical
simulations shows that the condition that R is large is important for the numerical validity of Eq. (34). For
example, Fig. 2 shows analytic and numerical results for $R = \ln 10^{20}$ in good agreement.
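
As an alternative to the Taylor or Fourier route, $P_{\rm gen}(k)$ for $\lambda = 0$ can be obtained by averaging $P(C_0 \kappa^* (R - s), \rho^*; k)$ over s with a simple quadrature, and the result can be compared with the closed form (34). The sketch below assumes that $P(\nu, \rho; k)$ of Appendix A is the negative binomial distribution $\Gamma(\nu+k)/[\Gamma(\nu)\, k!]\,(1-\rho)^{\nu} \rho^{k}$; the function names, the midpoint rule and the grid size are choices made for this illustration. The parameters are those of Fig. 2 with $\beta = 0$, so that $C_0 \kappa^* R = 1$ and $\rho^* = 0.95$.

```python
import math

def log_p(nu, rho, k):
    """log P(nu, rho; k), assuming the negative binomial form stated above."""
    return (math.lgamma(nu + k) - math.lgamma(nu) - math.lgamma(k + 1)
            + nu * math.log(1.0 - rho) + k * math.log(rho))

def generality_distribution(c0_kappa_r, rho, k_max, n_grid=400):
    """P_gen(k) for lambda = 0, k = 0..k_max.

    With x = (R - s)/R the average over s becomes an integral over x in
    (0, 1) of P(c0_kappa_r * x, rho; k), evaluated by the midpoint rule.
    """
    dist = [0.0] * (k_max + 1)
    for j in range(n_grid):
        nu = c0_kappa_r * (j + 0.5) / n_grid
        for k in range(k_max + 1):
            dist[k] += math.exp(log_p(nu, rho, k)) / n_grid
    return dist

def mgf_closed_form(c0_kappa_r, rho, z):
    """Eq. (34) for z != 1: (u - 1)/ln(u)."""
    u = ((1.0 - rho) / (1.0 - rho * z)) ** c0_kappa_r
    return (u - 1.0) / math.log(u)

p_gen = generality_distribution(c0_kappa_r=1.0, rho=0.95, k_max=300)
z = 0.5
series = sum(p * z ** k for k, p in enumerate(p_gen))
closed = mgf_closed_form(1.0, 0.95, z)
print(series, closed)
```

Agreement of the two printed values at a test point such as z = 0.5 confirms that the quadrature reproduces the generating function; the list `p_gen` then holds the distribution itself.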

4.3 The generality distribution for fixed S 

In order to compute the generality distribution $P_{\rm gen}(k|S)$ conditional to fixed S, we start again from the
distribution $P_{\rm gen}(s, k|S)$ for a consumer with speed parameter s. In order to simplify the calculations $\beta = 0$
is assumed here. Then $\kappa^* = \kappa^\circ = \kappa$ and $\rho^* = \rho^\circ = \rho$.

For a given consumer, the species pool can be divided into three subsets: (i) the actual resources of the
consumer, (ii) the allometrically possible but spurned resources, and (iii) the allometrically forbidden
resources (see Sec. 3.3). For small enough D, each clade is located in a single subset, and the species
distributions in the three subsets become independent. We first calculate the probability distribution for
the number of species in the union of the sets (ii) and (iii) for freely fluctuating S. As above, denote the
width of the range of allometrically possible resources on the s axis by $\Lambda$. The distribution of the species
number in set (ii) can be obtained from Eq. (31) by substituting $C_0 \to 1 - C_0$ and is therefore given by
$P((1 - C_0) \kappa \Lambda, \rho; n)$ as defined in Eq. (A.4). The distribution of the number of species in (iii) can be
obtained in the same way as the distribution of the total number of species (Sec. 3.1), just that the relevant
range of s is now $R - \Lambda$, and not R. Hence this distribution is given by $P(\kappa (R - \Lambda), \rho; n)$. The distribution
of the number of species in the union of these two sets is given by the convolution

$$P_{\rm union}(n) = P((1 - C_0) \kappa \Lambda, \rho; n) * P(\kappa (R - \Lambda), \rho; n) = P(\kappa (R - C_0 \Lambda), \rho; n). \tag{35}$$
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The second equation is easily verified by comparing the corresponding cumulant generating functions (A.l). 

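The number of species in set (i) is given by $P_{\mathrm{gen}}(k) = P(C_0\kappa\Delta, p; k)$ as defined above. Using the known distribution $P(\kappa R, p; S)$ for $S$, the conditional distribution of generality can be obtained as

$$P_{\mathrm{gen}}(k|S) = \frac{P_{\mathrm{gen}}(k)\,P_{\mathrm{union}}(S-k)}{P(\kappa R, p; S)} = \frac{\Gamma(C_0\kappa\Delta + k)\,\Gamma(\kappa R)\,\Gamma(1+S)\,\Gamma(\kappa(R - C_0\Delta) - k + S)}{\Gamma(C_0\kappa\Delta)\,\Gamma(1+k)\,\Gamma(\kappa(R - C_0\Delta))\,\Gamma(1-k+S)\,\Gamma(\kappa R + S)}. \tag{36}$$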

Remarkably, just as the conditional expectations Eqs. (20) and (21), this result is independent of $p$. The parameter $S$ plays a similar role instead (see below). Equation (36) is now evaluated for large $S$. Specifically, we assume that (i) $S \gg \kappa R$, which is natural when $S$ is of the order of its expectation value $\kappa R p/(1-p)$ and $1 - p \ll 1$, (ii) $S \gg 1$, (iii) we restrict ourselves to values of $k \ll S$, and (iv) in order to take the distinguished limit of fixed link density, we set $C_0 = Z_0/S$ with fixed $Z_0$. Expanding the logarithm of Eq. (36) for large $S$ (e.g., using Stirling's formula) then gives

$$\ln P_{\mathrm{gen}}(k|S) = \ln\frac{\kappa\Delta Z_0}{kS} + \frac{-(\kappa R - 1)\,k + \kappa\Delta Z_0\,[\gamma + \psi_0(\kappa R) + \psi_0(k) - \ln S]}{S} + \cdots, \tag{37}$$

where $\gamma \approx 0.57$ is the Euler constant and $\psi_0(x) = (d/dx)\ln\Gamma(x)$ the digamma function.
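Before any expansion, Eq. (36) can be evaluated directly through log-gamma functions. The sketch below does this and checks two properties that follow from the formula itself: the probabilities sum to one over $k = 0..S$ (which is the convolution identity behind Eq. (35)), and the conditional mean equals $S C_0\Delta/R$, independent of $p$. The function name and the parameter values are illustrative assumptions.

```python
import math

def p_gen_fixed_S(k, S, kappa, R, C0, Delta):
    """Eq. (36), evaluated in log space to avoid overflow of the Gamma functions."""
    a = C0 * kappa * Delta          # first argument of P for set (i)
    c = kappa * (R - C0 * Delta)    # first argument of P for the union of sets (ii) and (iii)
    log_p = (math.lgamma(a + k) + math.lgamma(kappa * R) + math.lgamma(1 + S)
             + math.lgamma(c - k + S)
             - math.lgamma(a) - math.lgamma(1 + k) - math.lgamma(c)
             - math.lgamma(1 - k + S) - math.lgamma(kappa * R + S))
    return math.exp(log_p)

# Illustrative parameters (assumed here): kappa*R = 10, C0 = 0.1, Delta = R/2, S = 100.
R = math.log(1e20)
dist = [p_gen_fixed_S(k, 100, 10.0 / R, R, 0.1, 0.5 * R) for k in range(101)]
total = sum(dist)                                  # should be 1 up to rounding
mean = sum(k * q for k, q in enumerate(dist))      # should be S*C0*Delta/R = 5
```

Note that $p$ does not appear among the arguments, in line with the observation that the result is independent of $p$.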



A similar expansion can be obtained for a distribution of the form $P(C_0\kappa\Delta, p; k)$ given by Eq. (A.4), when the parameter $p$ is assumed to behave such that $S = b/(1-p)$ with fixed $b$ for large $S$, which is natural in view of $\langle S\rangle \sim 1/(1-p)$. One obtains

$$\ln P(C_0\kappa\Delta, p; k) = \ln\frac{\kappa\Delta Z_0}{kS} + \frac{-b\,k + \kappa\Delta Z_0\,[\gamma + \ln b + \psi_0(k) - \ln S]}{S} + \cdots. \tag{38}$$

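A comparison of the two expansions shows that

$$P_{\mathrm{gen}}(s,k|S) \approx \mathcal{N}\,P(C_0\kappa\Delta(s), \tilde p; k), \tag{39}$$

where

$$\mathcal{N} = \mathcal{N}(s) = \exp\{C_0\kappa\Delta(s)\,[\psi_0(\kappa R) - \ln(\kappa R - 1)]\} \tag{40}$$

and

$$\tilde p = 1 - \frac{\kappa R - 1}{S}. \tag{41}$$

Hence, apart from the new parameters $\mathcal{N}$ and $\tilde p$, the form of the generality distribution for fixed $S$ is approximately the same as for fluctuating $S$.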
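The quality of the approximation (39) to (41) can be inspected by comparing it with the exact expression (36). The following sketch does so for one illustrative parameter set; the digamma routine (recurrence plus the standard asymptotic series) is a helper written for this example, and the shorthand arguments `kR` $= \kappa R$ and `a` $= C_0\kappa\Delta$ are assumptions of the sketch.

```python
import math

def digamma(x):
    """Digamma function: shift x upward by recurrence, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def nb_pmf(A, B, n):
    """P(A, B; n) of Eq. (A.4)."""
    return math.exp(A * math.log1p(-B) + n * math.log(B)
                    + math.lgamma(A + n) - math.lgamma(A) - math.lgamma(1 + n))

def p_gen_exact(k, S, kR, a):
    """Eq. (36) with kR = kappa*R and a = C0*kappa*Delta."""
    c = kR - a
    return math.exp(math.lgamma(a + k) + math.lgamma(kR) + math.lgamma(1 + S)
                    + math.lgamma(c - k + S) - math.lgamma(a) - math.lgamma(1 + k)
                    - math.lgamma(c) - math.lgamma(1 - k + S) - math.lgamma(kR + S))

def p_gen_approx(k, S, kR, a):
    """Eq. (39) with the normalization of Eq. (40) and p-tilde of Eq. (41)."""
    p_tilde = 1.0 - (kR - 1.0) / S
    norm = math.exp(a * (digamma(kR) - math.log(kR - 1.0)))
    return norm * nb_pmf(a, p_tilde, k)
```

For `S = 300`, `kR = 10` and `a = 0.5`, the two functions should agree to within a few percent for small `k`, which is what one expects from an expansion that is first order in $1/S$.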

The additional normalization factor $\mathcal{N}$ enters because $k$ can never exceed $S$,$^1$ while $P(C_0\kappa\Delta, p; k)$ is nonzero for all $k$. When the expected number of consumers is much smaller than $S$, i.e., for small connectances $C_0$, the value of $\mathcal{N}$ approaches 1. This can be seen by noting that $\psi_0(x) - \ln(x-1) = 1/(2x) + O(x^{-2})$, so that we can write $\mathcal{N} = \exp[C\Delta/(2R)]$ with $C \approx C_0$. The dependence on $S$ is fully contained in the new parameter $\tilde p$. Its relation to $p$ can be understood by substituting $S$ in Eq. (41) by $\langle S\rangle = \kappa R p/(1-p)$, which leads to

$$1 - \tilde p = \frac{\kappa R - 1}{\kappa R p}\,(1-p) \approx (1-p). \tag{42}$$

Of course, the foregoing interpretation of Eq. (39) makes sense only when $\kappa R > 1$. Yet, Eq. (39) is numerically valid also when continued analytically to the region $0 < \kappa R < 1$ where $\tilde p > 1$.



In Section 4.1 it was shown that the effect of a small, non-zero $\beta$ can be approximated by a renormalization of the coefficients $\kappa$ and $p$. Equation (39) shows that for $\beta = 0$ the effect of fixing $S$ is also essentially a renormalization of $p$. Even though the generality distribution for fixed $S$ and non-zero $\beta$ is difficult to compute analytically, it is reasonable to assume that this too can be approximated by an expression of the form (39) with an appropriate pair of parameters $\kappa$ and $p$.

Figure 3: Steady-state generality distributions conditional to fixed species number $S$ obtained from simulations with $C_0 = 0.1$, $S = 100$ (•), $C_0 = 0.1$, $S = 300$ (+), $C_0 = 0.5$, $S = 100$ (○), and $C_0 = 0.5$, $S = 300$ (×) in comparison with the corresponding predictions by Eq. (44) (solid) and by directly averaging Eq. (36) over $\Delta = 0..R$ (dashed). The other parameters were $R = \ln 10^{20}$, $D = 0.005$, $p = 0.95$, $\kappa = 10/R$, $\lambda = 10^{-3}$, $\beta = 0$ (no fitting). For all examples $\langle S\rangle = 190$. The inset shows the same data on a double-logarithmic scale.

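In order to obtain the overall conditional generality distribution we go, again, over to the moment-generating function

$$M_{\mathrm{gen}}(s,z|S) := \sum_{k=0}^{\infty} P_{\mathrm{gen}}(s,k|S)\,z^k \approx \mathcal{N}\left(\frac{1-\tilde p}{1-\tilde p z}\right)^{C_0\kappa\Delta(s)}. \tag{43}$$

The average of this expression over $s$ for the simple case $\lambda \to 0$ is

$$M_{\mathrm{gen}}(z|S) \approx \frac{\tilde u - 1}{\log\tilde u} \quad\text{with}\quad \tilde u = \exp\left(\frac{C}{2}\right)\left(\frac{1-\tilde p}{1-\tilde p z}\right)^{C_0\kappa R}. \tag{44}$$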

This result was verified by comparison with direct numerical simulations of the model. Figure 3 shows simulation results in comparison with the predictions of Eq. (44) and with the results of numerically averaging Eq. (36) directly over $\Delta = 0..R$. Although the precision of the approximation Eq. (44) decreases for increasing $C_0$ and $k$ in comparison with the prediction using Eq. (36), it is surprisingly good even for large values of $C_0$ and $k$. For large $C_0$ and small $k$ the simulations deviate noticeably also from the prediction using Eq. (36), because in this parameter range the effects of intra-clade consumption, which have been ignored here, become relevant. Even for smaller $R$, $\kappa R = O(1)$, and $\beta > 0$, where Eq. (44) does not make quantitative predictions, the general form of this expression still seems to be valid. Figure 4 shows some examples of numerical results in this regime compared with curves obtained by fitting $C$, $\tilde p$ and $\kappa$ in Eq. (44). The fitted curves describe the distributions about as well as the quantitative predictions above: deviations occur mainly for very small and very large $k$.

Figure 4: Simulation results for the generality distributions conditional to $S = 40$ with $C_0 = 0.1$, $\beta = 0$ (squares), $C_0 = 0.1$, $\beta = 0.05$ (circles), $C_0 = 0.3$, $\beta = 0.05$ (triangles) compared to distributions fitted by adjusting the parameters $C$, $\tilde p$ and $\kappa$ in Eq. (44) (dashed, solid, dotted line). The other parameters were $R = \ln 10^4$, $D = 0.005$, $p = 0.95$, $\kappa = 2/R$, $\lambda = 10^{-3}$. Horizontal axis: generality $k$.




Figure 5: Steady-state vulnerability distributions conditional to fixed species number $S$. Parameters are the same as in Fig. 3. The solid and dashed lines correspond to Eqs. (46) and (45), respectively. The inset shows the same data on a double-logarithmic scale. Horizontal axis: vulnerability $m$.

4.4 The vulnerability distribution for fixed S



The distribution of the vulnerability $m$ is most easily computed directly conditional to fixed $S$: Assume species to be indexed in the order of increasing $s$ starting with 1. For $\lambda \to 0$ the number of possible consumers of species $i$ is then simply $i$. When assuming again that resources evolve much faster than their consumers, the consumers of $i$ are determined by (i) the random connection of consumers with probability $C_0$ when the resource-clade founder enters the food web and (ii) random re-connections with probability $C_0$ during speciations of resources. Neither of these processes introduces correlations in the connectivities within the set of possible consumers of $i$. Thus, links are statistically independent and the vulnerability of $i$ is given by a binomial distribution. Averaging over the food web yields

$$P_{\mathrm{vul}}(m|S) = \frac{1}{S}\sum_{i=m}^{S}\binom{i}{m}\,C_0^m\,(1-C_0)^{i-m}. \tag{45}$$



This is exactly the expression that Camacho et al. (2002a) obtained in their analysis of the niche model. Following their observation that in the limit of large $S$ with constant $Z_0 = C_0 S$ and $i = O(S)$ the binomial distribution can be approximated as Poisson and the sum by an integral, one obtains

$$P_{\mathrm{vul}}(m|S) \approx \frac{1}{Z_0}\int_0^{Z_0}\frac{t^m e^{-t}}{m!}\,dt. \tag{46}$$



As is shown in Fig. 5, this result predicts the vulnerability distribution about as well as Eq. (44) predicts the generality distribution.
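Both forms of the vulnerability distribution are easy to evaluate. The sketch below implements the binomial average of Eq. (45) and the Poisson integral of Eq. (46); for the latter it uses the elementary identity $\int_0^{Z_0} t^m e^{-t}/m!\,dt = 1 - e^{-Z_0}\sum_{j=0}^{m} Z_0^j/j!$. Function names and parameter values are illustrative assumptions.

```python
import math

def p_vul_binomial(m, S, C0):
    """Eq. (45): binomial vulnerability averaged over the species i = m..S."""
    total = sum(math.comb(i, m) * C0**m * (1.0 - C0)**(i - m)
                for i in range(m, S + 1))
    return total / S

def p_vul_poisson(m, Z0):
    """Eq. (46), using the closed form of the integral over the Poisson weights.

    Intended for moderate m (the factorial is converted to float).
    """
    partial = sum(Z0**j / math.factorial(j) for j in range(m + 1))
    return (1.0 - math.exp(-Z0) * partial) / Z0
```

With, for example, `S = 1000` and `C0 = 0.01` (so that `Z0 = 10`), the two functions should give nearly identical values, and `p_vul_poisson` should sum to one over `m`.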

Note that the Poisson distribution entering Eq. (46) is the special case $P(t/B, B; n)$, $B \to 0$ of the general distribution $P(A, B; n)$ entering Eq. (39). Thus, the integral (46) is also a limiting case of the general form Eq. (44). In the case of generality distributions, however, $B$ is typically close to one.



5 Comparison with other topological food-web models 



5.1 Comparison with the cascade model 

The main idea upon which the cascade model is based, random connections restricted by a trophic hierarchy, is retained in the speciation model, albeit refined in several ways. The cascade model is recovered from the speciation model in the limit of no loops ($\lambda = 0$), and no speciations$^2$, i.e. $r_+ \to 0$. Then all species enter the species pool by adaptations and are independently, randomly connected to their resources and consumers, just as it was assumed for the consumers alone in the foregoing section. However,



the limit $r_+ \to 0$ does not describe empirical data particularly well (Rossberg et al., 2005). Typical parameter sets for the speciation model have $r_+ \approx r_-$ (Tab. 1).



5.2 Comparison with the niche model 
5.2.1 Degree distributions 

It was mentioned already that the distribution of vulnerability in the niche model is approximately the same as in the speciation model, in both cases given by Eq. (46). In the case of the niche model $Z_0 = 2CS = 2Z$, where the targeted connectance $C$ and the species number $S$ are parameters of the model. In both cases the distribution is due to random connections with possible consumers.



$^2$ Observe that for $r_+ \to 0$ the often encountered combination $-\kappa\ln(1-p)$ simplifies to $r_1/r_-$.

For the generality distribution the situation is more complex. As the analysis of Camacho et al. (2002a) showed, it is for the niche model essentially determined by the distribution of the "niche width", i.e., the size of the interval containing the resources of a species on the niche-parameter scale. Williams and Martinez (2000) chose this width for each species as its niche value $n$ times a random variable $x$ with a beta distribution of the form

$$p_x(x) = b\,(1-x)^{(b-1)} \approx b\,e^{-bx}, \tag{47}$$

where $b = (1-2C)/(2C)$ depends on the targeted directed connectance $C$. The approximation by an exponential is valid for $b \gg 1$, i.e. for $C \approx 1/(2b) \ll 1$. Williams and Martinez (2000) used this particular form for its computational simplicity. No ecological arguments to motivate it seem to be known. Since species are independently and evenly distributed with density $S$ in the one-dimensional "niche space", the number of species in the niche interval follows a Poisson distribution with expectation value $Snx$ when $x$ is fixed. Averaging over all $x$ yields the geometric distribution

$$P^{(\mathrm{niche})}_{\mathrm{gen}}(n,k) = \int_0^\infty \frac{(Snx)^k}{k!}\,e^{-Snx}\,b\,e^{-bx}\,dx \approx \frac{1}{1+nZ_0}\left(\frac{nZ_0}{1+nZ_0}\right)^k. \tag{48}$$

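The overall generality distribution is obtained by averaging Eq. (48) over $n$. The calculation is simplified by the approximation $k \approx Snx$, i.e.

$$P^{(\mathrm{niche})}_{\mathrm{gen}}(n,k) \approx \frac{1}{Sn}\,p_x\!\left(\frac{k}{Sn}\right) = \frac{1}{nZ_0}\exp\left(-\frac{k}{nZ_0}\right), \tag{49}$$

which is valid for $nZ_0 \gg 1$ [cf. Eq. (48)]. This leads to the result of Camacho et al. (2002a)

$$P^{(\mathrm{niche})}_{\mathrm{gen}}(k) = \int_0^1 P^{(\mathrm{niche})}_{\mathrm{gen}}(n,k)\,dn \approx \frac{1}{Z_0}\,E_1\!\left(\frac{k}{Z_0}\right) \tag{50}$$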
Jo ^0 ^0 



with $E_1(x) := \int_x^\infty t^{-1}\exp(-t)\,dt$ denoting the exponential integral function. Camacho et al. (2002b) concluded that the distribution of the scaled generality $k/(2Z)$ or, more appropriate for single instances of food webs, its cumulative distribution, should have the universal form

$$P_>(x) = \int_x^\infty E_1(x')\,dx' = \exp(-x) - x\,E_1(x), \tag{51}$$

and verified this impressively by a comparison with empirical data.
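The universal form (51) is straightforward to tabulate. The following sketch evaluates $E_1(x)$ from its standard convergent power series, $E_1(x) = -\gamma - \ln x - \sum_{n\ge 1} (-x)^n/(n\,n!)$, and then the cumulative distribution of Eq. (51). The series is an implementation choice of this example and is only adequate for moderate arguments (roughly $0 < x \lesssim 5$).

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, terms=60):
    """E_1(x) from its power series; adequate for moderate x > 0."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= -x / n          # term now equals (-x)^n / n!
        total += term / n
    return -EULER_GAMMA - math.log(x) - total

def cumulative_generality(x):
    """Right-hand side of Eq. (51): exp(-x) - x * E_1(x)."""
    return math.exp(-x) - x * exp_integral_e1(x)
```

A simple consistency check is that the derivative of `cumulative_generality` equals `-exp_integral_e1`, as required by the integral representation in Eq. (51).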

In order to see if this observed regularity is reproduced also by the speciation model, cumulative distribution functions for the speciation model obtained from Eq. (44) were compared with Eq. (51). The value for $k = 0$ was excluded from the comparison because (i) in many empirical food webs the lowest trophic level ($k = 0$) is only poorly resolved and (ii) the approximation (50) is undefined at $k = 0$ and Eq. (44) is not accurate at this point either. The scaling factor $Z_0^{-1}$ for the generality and the correction $\mathcal{N}$ of the normalization constant were therefore determined directly by transforming the cumulative speciation-model distributions to $\mathcal{N}\sum_{k'=k}^{\infty} P_{\mathrm{gen}}(k'/Z_0|S)$ such as to minimize the mean-least-square deviation from (51) for $k \ge 1$. These curves match Eq. (51) surprisingly well over a wide parameter range (Fig. 6a). The empirical data is described well by both distributions (Fig. 6b).

To understand the reason for this apparent scaling law of speciation-model food webs, consider the speciation-model generality distribution (44) conditional to $k \ge 1$ in the limit of low connectance $C_0, C \to 0$ (now at fixed $S$), i.e. the distribution with the moment generating function

$$\lim_{C_0, C \to 0}\frac{M_{\mathrm{gen}}(z|S) - M_{\mathrm{gen}}(0|S)}{1 - M_{\mathrm{gen}}(0|S)} = \frac{\ln(1-\tilde p z)}{\ln(1-\tilde p)}. \tag{52}$$

This is easily seen to be the distribution of resource-clade sizes [cf. Eq. (13)]

$$\frac{-\tilde p^{\,k}}{k\,\ln(1-\tilde p)}. \tag{53}$$

Figure 6: Comparison of niche-model and speciation-model predictions for the cumulative generality distribution. (a) The approximation (44) for the speciation model with $C_0\kappa R = 0.2$, $\tilde p = 0.75$ (triangles), $C_0\kappa R = 1.5$, $\tilde p = 0.75$ (circles), $C_0\kappa R = 0.2$, $\tilde p = 0.98$ (plus), $C_0\kappa R = 1.5$, $\tilde p = 0.98$ (dotted line) in comparison with the approximation (51) for the niche model (solid line). (b) The empirical distribution for Little Rock Lake (Martinez, 1991) (dots), the speciation model prediction from numerical simulations (dashed, shaded area is the 1-$\sigma$ range of fluctuations before scaling), and again the approximation (51) for the niche model (solid line). All distributions have the point $k = 0$ removed and are scaled and normalized to minimize mean-square deviations from Eq. (51).



In this limit of low connectance most species belong to the lowest trophic level, only a few heterotrophs remain, and the percolation of the network is lost. Therefore, this limit does not correspond to the general situation encountered in the field. But the approximate form of the log-series distribution (53) is retained also for more complex networks. For values of $\tilde p \approx 0.8$, this distribution has a shape quite similar to the exponential integral distribution Eq. (50). When going over to cumulative distributions, the fit looks even better. Thus, the observed generality distributions can be interpreted mechanistically in terms of the steady-state distributions of evolutionary clade sizes, corrected for fixing $S$ and trophic link breaking. This also suggests that the "scaling" distribution (53) $\sim k^{-1}\exp(k\ln\tilde p)$ or the more accurate result (44) would be more adequate functional forms than the exponential integral function (50).
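The log-series distribution (53) and its generating function (52) can be written down in a few lines. The sketch below is meant only to make the two formulas concrete; the function names are illustrative, and the value $\tilde p = 0.8$ in the usage note is the one mentioned in the text.

```python
import math

def log_series_pmf(k, p_tilde):
    """Eq. (53): -p^k / (k ln(1 - p)) for k = 1, 2, ..."""
    return -p_tilde**k / (k * math.log1p(-p_tilde))

def log_series_mgf(z, p_tilde):
    """Right-hand side of Eq. (52): ln(1 - p z) / ln(1 - p)."""
    return math.log1p(-p_tilde * z) / math.log1p(-p_tilde)
```

For `p_tilde = 0.8`, summing `log_series_pmf(k, 0.8)` over `k >= 1` should give one, and the power series with weights `z**k` should reproduce `log_series_mgf(z, 0.8)`.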

In spite of the similarities of the overall generality and vulnerability distributions, there are marked differences in the detailed predictions of the two models. Consider, for example, the generality distribution for species near the lower end of the trophic cascade, i.e., species with $\kappa\Delta(s) \ll 1$ in the speciation model and $n \ll 1$ in the niche model, that have at least one resource species ($k \ge 1$). For the speciation model Eqs. (39) and (A.4) lead again to the clade-size distribution (53), while for the niche model Eq. (48) predicts

$$(1 - nZ_0)\,(nZ_0)^{k-1}. \tag{54}$$

Thus, for the niche model it is very probable that such a species has exactly one resource, whereas for the speciation model larger generalities can also be expected. An empirical test should be capable of distinguishing these two predictions.



5.2.2 Intervality 



A major distinction of the niche model from the cascade model is the intervality it enforces upon the diets of consumers. While the degree of intervality obtained with the cascade model is typically too small compared with empirical data (Cohen et al., 1990), it is too large for the niche model (Cattin et al., 2004). Under certain conditions the speciation model can also produce a high degree of intervality. Consider some arbitrary ordering of clades, for example by the speed parameter of the founder species, and an ordering of the species within each clade given by a traversal of the evolutionary tree$^3$. For this ordering (which differs from an ordering by $s$) diets will form contiguous sets when (i) the average number of resource clades (33) is low, i.e., when most consumers have either one or no resource clade, and (ii) the probability that resources break out of a resource clade during the clade's lifetime is low. Then the set of a consumer's resources is usually simply the non-extinct part of an evolutionary subtree. The probability of resource break-out is small when $\beta \times$ (resource clade size) $\times$ (clade lifetime in generations) is small, which, by arguments analogous to those used in Section 3.2, is the case when



/V/(i-p*)<i. 



(55) 



For typical model parameters we find that these two conditions are satisfied to some extent but not too well (Tab. 1), in accordance with expectations. Correspondingly, the degree of intervality $D_{\mathrm{diet}}$ (Cattin et al., 2004) of empirical data is reproduced well by the model (Rossberg et al., 2005).

We conclude with Cattin et al. (2004) that the larger-than-random intervality observed in food webs may not so much result from a low dimensionality of the niche space, as has been proposed (Cohen, 1978), but rather reflects the importance of the phylogenetic history for the food-web structure.



5.3 Comparison with the nested hierarchy model 

Just as for the niche model, the generality distribution for the nested hierarchy model is imposed "by hand" by specifying the distribution (47) and setting $k \approx Snx$. But the structure of the set of resources is determined by a more complex algorithm that has been designed in such a way that consumers and resources form groups ($\approx$ clades), and consumers and resources from the same groups share resources and
consumers, respectively. The algorithm is intended to mimic a structure that would result from a 
phylogenetic evolution of the web, without explicitly modeling this evolution. The speciation model 
achieves a similar effect by explicitly modeling the evolutionary dynamics. 



6 Variants of the speciation model 

Modeling complex ecological systems often requires difficult decisions with regard to which kinds of effects ought to be incorporated into a model and which can be ignored. Here, two variants of the speciation model are briefly discussed that include aspects of the real system that had been left out in the original model. For both variants, the analytic results derived in the previous sections remain valid without change.

6.1 A variant with asymmetric link persistence 

In the analysis above it was assumed that consumer-resource links are statistically independent of the 
phylogenetic history of the consumers. If this assumption is valid, one may as well modify the model such 
as to choose all resources of a descendant species at random after its speciation, without affecting the 
analytic results obtained above. More generally, one might incorporate an asymmetry in the persistence (or 
reconnection probability) of links between consumers and resources in the following way: 

In the original form of the model, the connectivity of the descendant species was (randomly) re-assigned for a fraction $\beta$ of all possible trophic links. In the asymmetric variant of the model, the connectivity from the descendant species to its consumers is re-assigned for a fraction $\beta_c$ of all possible consumers, and the connectivity to resources is re-assigned for a fraction $\beta_r$ of all possible resources, with $\beta_c \ne \beta_r$ in general.

In fact, there is no ecological reason to expect $\beta_c = \beta_r$. A large difference between the values of $\beta_c$ and $\beta_r$ such as considered above ($\beta_c = \beta \ll \beta_r = 1$) could be understood from the assumption that in the competition between related species their sets of resources are much more important than their sets of consumers: In order to avoid competitive exclusion, related species need drastically different sets of resources ($\beta_r = 1$), while there is only little evolutionary pressure for a descendant species to have a different set of consumers than its predecessor ($\beta_c \ll 1$).

$^3$ For example, the order given by the recursive algorithm list(A) defined as

1. if A has not become extinct
       print A;
2. for all direct descendants B of A in order of appearance
       list(B);

starting with list(clade founder).
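The recursive ordering described in footnote 3 can be sketched in a few lines. The `Species` record below, with its `extinct` flag and list of direct `descendants`, is a hypothetical data structure introduced only for this illustration; the traversal itself follows the two steps of the footnote (emit the species if it is not extinct, then recurse into its direct descendants in order of appearance).

```python
from dataclasses import dataclass, field

@dataclass
class Species:
    """Hypothetical node of the evolutionary tree of one clade."""
    name: str
    extinct: bool = False
    descendants: list = field(default_factory=list)  # in order of appearance

def list_clade(a, out=None):
    """Return the names of non-extinct species in the order of footnote 3."""
    if out is None:
        out = []
    if not a.extinct:            # step 1: emit A unless it has become extinct
        out.append(a.name)
    for b in a.descendants:      # step 2: recurse into direct descendants
        list_clade(b, out)
    return out
```

Calling `list_clade(founder)` on the clade founder yields the within-clade ordering; extinct species are skipped, but their descendants are still visited.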

However, one might also argue that the evolutionary pressure arises from the direct resource-consumer interaction alone. Then one could expect it to be advantageous for a descendant species to evade its predecessor's consumers (large $\beta_c$), while maintaining its resources (small $\beta_r$). This would lead to the reverse relation between $\beta_c$ and $\beta_r$. An empirical test to establish which of these two mechanisms is more relevant might be possible.



6.2 A variant with quantitative link strength 

Topological food-web models are often criticised for ignoring the fact that the link strength in food webs, instead of being either 1 or 0, is in reality a continuous quantity (Berlow et al., 2004). There is a simple way to incorporate continuously varying link strengths in the speciation model without affecting its statistical properties.

Instead of assigning to each possible trophic link a connectivity of either 0 or 1, quantify the strength of each possible link by a real number between 0 and 1. Where the connectivity was copied during speciations in the original model, the link strength is copied now. Where the connectivity was set to 1 with probability $C_0$ and to 0 otherwise, set the link strength to an appropriately distributed random number between 0 and 1 now. For a characterization of the resulting food webs in terms of topological
food-web statistics, count each link with strength larger than some threshold as present, and all other links 
as absent. That is, the thresholding of the link strength is just delayed to the time of the characterization. 
While this modification is straightforward for the speciation model, modifications of other topological 
models to postpone the thresholding of link strength might be possible, if at all, only at the price of 
increasing the model complexity. 
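The delayed thresholding can be illustrated with a short sketch. It assumes, for concreteness, that link strengths are drawn uniformly from $[0, 1)$ and that the threshold is $1 - C_0$, so that a freshly drawn link is counted as present with probability $C_0$; the uniform law and the function names are assumptions of this example, since the text only asks for an "appropriately distributed" random number.

```python
import random

def draw_link_strengths(n_consumers, n_resources, rng):
    """Draw one strength per possible link (uniform on [0, 1): an assumption)."""
    return [[rng.random() for _ in range(n_resources)] for _ in range(n_consumers)]

def threshold_links(strengths, C0):
    """Count a link as present when its strength exceeds the threshold 1 - C0."""
    threshold = 1.0 - C0
    return [[w > threshold for w in row] for row in strengths]
```

Applying `threshold_links` only at the time of characterization leaves the stored strengths untouched, so the same web can be characterized at several thresholds.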

Of course, an evolution where the link strength either does not change at all or is reset to a completely new random value is quite artificial. It would be more natural to vary the link strength by a small random amount at each evolutionary step. In such a model, link breaking and reconnecting events relative to some threshold $(1 - C_0)$ would be correlated. They would be concentrated at certain pairs of consumer and resource clades with link strength near the threshold. Further studies are required to understand what effect this would have on the overall network structure.



7 Discussion and Outlook 



Besides improving the general understanding of the properties of the speciation model and their dependence on model parameters, a purpose of this work was also to show that the speciation model integrates the underlying ideas from previous, simpler models (see Section 5). The speciation model retains the trophic ordering of the cascade model. In fact, it contains the cascade model as a special case. By the interplay of speciations, extinctions, and adaptations of new species to the habitat, the speciation model reproduces three key features of the niche model and the nested hierarchy model at the same time: (1) the empirical distributions of generality, which in the niche model and similarly in the nested hierarchy model are obtained only by a special, ecologically unmotivated choice of the niche-width distributions; (2) intervality, to the degree that is actually observed (Cattin et al., 2004; Rossberg et al., 2005); (3) the organization of resources into groups of related species that share consumers and vice versa. This unifying character of the speciation model is probably the main reason for its high accuracy in reproducing empirical data (Rossberg et al., 2005).



The observed broad, log-series-like generality distributions have been traced back to, among others, a condition $1 - p \ll 1$. This means that the rate constant for speciations $r_+$ is numerically close to the rate constant for extinctions $r_-$. For any phylogenetically closed system, a steady state always requires that extinction rates and speciation rates are equal, independent of the statistical details of the branching pattern. For the half-open system considered here, $1 - p \ll 1$ implies that the contributions from foreign adaptations to the species pool are small compared to the contribution from speciations. In fact, $1 - p$ directly equals the fraction of species in the food web that have entered by foreign adaptations. However, in order to obtain broad, left-skewed generality distributions, the independence of the speciation and extinction probability of a species from the actual size of its clade is also important. If, instead, large clades notably favored extinctions and small clades speciations, clade-size distributions would be dominated by a "typical" clade size, which would, in the model, also lead to a narrower generality distribution. In an analysis of paleontological time series, Raup (1991) applied a model for the size of genera identical to the
model used here for the dynamics of clade sizes [Eqs. While, on the average, this model (with 

$p = 0.996$) reproduced the data well, the scatter in the paleontological data was larger than in the model. Raup could explain this observation by assuming that the overall evolution rate varies over time. Since such a variation can also be described by a (random) nonlinear transformation of the time axis, it does not affect statistics that refer only to a particular moment in time, such as food-web structures. Thus our assumption of a simple birth/death process is supported by paleontological observations.

As a direct consequence of this birth/death process, a characterization of food webs in terms of "clades" has been derived. Table 1 lists expectation values for characteristic quantities corresponding to some empirical food webs. It might be interesting to compare these results with the taxonomic structure of the actual empirical webs or the model dynamics with paleontological records.

In Section 4 it was shown that a correlation between the evolution rates and the trophic height leads to the observed asymmetry between generality and vulnerability distributions. However, in the present model this requires evolution rates spanning an unrealistically large range of about 20 orders of magnitude. We are currently evaluating a variant of the speciation model that achieves a similar effect without any differences in evolution rates by making hereditary not the trophic links themselves but the properties of species that determine link strengths. An asymmetry of the heredity between species-as-consumers and species-as-resources leads effectively to an asymmetry of the link persistence as described in Section 6.1 above. Numerical results with the new model are promising, but analytically we understand it only in so far as it can be approximated by the speciation model, so that the analysis presented here remains valid. Details regarding the new model will be reported elsewhere.

Our findings indicate that a food web's population dynamical stability and persistence are not as 
important determinants of its structure as is sometimes assumed. From a technical point of view, this is 
good news. It appears possible to obtain natural food-web structures without time-consuming population 
dynamical simulations. These food webs could then be investigated also with respect to the question how 
their structure affects population dynamical stability. 

In the course of this work, analytic approximations for several empirically testable predictions of the speciation model could be obtained. These include the average clade size $\langle n\rangle$, the number of clades $\langle c\rangle$ in the web, the age of clades in generations (speciation times) $-\ln(1-p)$, the average number of resource clades Eq. (33), and the generality distribution of consumers at low trophic levels Eq. (53). A careful comparison of the models discussed here and other food-web models with existing empirical data and new results from ongoing efforts in the field will reveal discrepancies and, hopefully, suggest new ideas, bringing us another step closer to understanding this fascinating aspect of life on earth.

A A family of distribution functions encountered in the analysis of the speciation model

The analysis of the steady state of a simple model of evolutionary dynamics (Sec. 3) naturally leads to probability distributions $p_n$ for species number $n$ with a cumulant generating function

$$\ln\sum_n p_n z^n = K_{A,B}(z) = A\ln\left(\frac{1-B}{1-Bz}\right), \tag{A.1}$$

where $0 < A$, $0 < B < 1$.
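From this, the mean

$$\langle n\rangle = \left.\frac{dK_{A,B}(e^u)}{du}\right|_{u=0} = \frac{AB}{1-B} \tag{A.2}$$

and variance

$$\mathrm{var}\,n = \left.\frac{d^2K_{A,B}(e^u)}{du^2}\right|_{u=0} = \frac{AB}{(1-B)^2} \tag{A.3}$$

can be calculated directly. The ratio $(\mathrm{var}\,n)/\langle n\rangle$ is $(1-B)^{-1}$ times larger than for Poisson distributions. The distribution function itself is given by

$$p_n = P(A,B;n) := \frac{(1-B)^A\,B^n\,\Gamma(A+n)}{\Gamma(A)\,\Gamma(1+n)}. \tag{A.4}$$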



This implies that the ratio of consecutive probabilities is

$$\frac{p_{n+1}}{p_n} = \frac{B(A+n)}{1+n}. \tag{A.5}$$

In particular, the most probable value is $n = 0$ whenever $AB < 1$. Since $B < 1$, this is always the case when $A \le 1$. For $A = 1$ one gets exactly a geometric distribution

$$p_n = (1-B)\,B^n, \tag{A.6}$$

and for small $A$ Eq. (A.4) simplifies to the log-series distribution

$$p_n = \begin{cases} 1 + A\ln(1-B) + O(A^2) & \text{for } n = 0,\\[4pt] \dfrac{A B^n}{n} + O(A^2) & \text{otherwise.} \end{cases} \tag{A.7}$$

For small $B$ a Poisson distribution is obtained: With fixed $AB$,

$$p_n = \frac{(AB)^n}{n!}\,e^{-AB} + O(B)$$

uniformly in $n$. Finally, when $AB \gg 1$ the distribution $p_n$ can be approximated by a Gaussian with mean and variance given by Eqs. (A.2, A.3).
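The properties listed in this appendix are easy to confirm numerically. The sketch below evaluates $P(A,B;n)$ once directly from Eq. (A.4) and once from the recurrence (A.5), and can be used to check the mean (A.2) and variance (A.3). The function names and the test values of $A$ and $B$ are illustrative assumptions.

```python
import math

def nb_pmf(A, B, n):
    """P(A, B; n) of Eq. (A.4), evaluated in log space."""
    log_p = (A * math.log1p(-B) + n * math.log(B)
             + math.lgamma(A + n) - math.lgamma(A) - math.lgamma(1 + n))
    return math.exp(log_p)

def nb_pmf_table(A, B, n_max):
    """p_0 .. p_{n_max} from p_0 = (1 - B)^A and the ratio of Eq. (A.5)."""
    table = [(1.0 - B) ** A]
    for n in range(n_max):
        table.append(table[-1] * B * (A + n) / (1 + n))
    return table
```

For example, with `A = 2.5` and `B = 0.6` the table sums to one, its mean is `A*B/(1-B) = 3.75` and its variance is `A*B/(1-B)**2 = 9.375`.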